NSF PAR Search | NSF Public Access Repository

Explainable multi-task learning for multi-modality biological data analysis

https://doi.org/10.1038/s41467-023-37477-x

Tang, Xin; Zhang, Jiawei; He, Yichun; Zhang, Xinhe; Lin, Zuwan; Partarrieu, Sebastian; Hanna, Emma Bou; Ren, Zhaolin; Shen, Hao; Yang, Yuhong; et al. (May 2023, Nature Communications)

Abstract Current biotechnologies can simultaneously measure multiple high-dimensional modalities (e.g., RNA, DNA accessibility, and protein) from the same cells. A combination of different analytical tasks (e.g., multi-modal integration and cross-modal analysis) is required to comprehensively understand such data, inferring how gene regulation drives biological diversity and functions. However, current analytical methods are designed to perform a single task, only providing a partial picture of the multi-modal data. Here, we present UnitedNet, an explainable multi-task deep neural network capable of integrating different tasks to analyze single-cell multi-modality data. Applied to various multi-modality datasets (e.g., Patch-seq, multiome ATAC + gene expression, and spatial transcriptomics), UnitedNet demonstrates similar or better accuracy in multi-modal integration and cross-modal prediction compared with state-of-the-art methods. Moreover, by dissecting the trained UnitedNet with the explainable machine learning algorithm, we can directly quantify the relationship between gene expression and other modalities with cell-type specificity. UnitedNet is a comprehensive end-to-end framework that could be broadly applicable to single-cell multi-modality biology. This framework has the potential to facilitate the discovery of cell-type-specific regulation kinetics across transcriptomics and other modalities.
Full Text Available
Targeted Cross-Validation

Zhang, Jiawei; Ding, Jie; Yang, Yuhong (January 2022, Bernoulli)

Full Text Available
Combining forecasts for universally optimal performance

https://doi.org/10.1016/j.ijforecast.2021.05.004

Qian, Wei; Rolling, Craig A.; Cheng, Gang; Yang, Yuhong (January 2022, International Journal of Forecasting)

Full Text Available
Parallel Assisted Learning

https://doi.org/10.1109/TSP.2022.3229637

Wang, Xinran; Zhang, Jiawei; Hong, Mingyi; Yang, Yuhong; Ding, Jie (January 2022, IEEE Transactions on Signal Processing)

Full Text Available
Information criteria for model selection

https://doi.org/10.1002/wics.1607

Zhang, Jiawei; Yang, Yuhong; Ding, Jie (February 2023, WIREs Computational Statistics)

Abstract The rapid development of modeling techniques has brought many opportunities for data-driven discovery and prediction. However, this also leads to the challenge of selecting the most appropriate model for any particular data task. Information criteria, such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC), have been developed as a general class of model selection methods with profound connections with foundational thoughts in statistics and information theory. Many perspectives and theoretical justifications have been developed to understand when and how to use information criteria, which often depend on particular data circumstances. This review article will revisit information criteria by summarizing their key concepts, evaluation metrics, fundamental properties, interconnections, recent advancements, and common misconceptions to enrich the understanding of model selection in general. This article is categorized under: Data: Types and Structure > Traditional Statistical Data; Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods; Statistical and Graphical Methods of Data Analysis > Information Theoretic Methods; Statistical Models > Model Selection
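For readers unfamiliar with the two criteria named in this abstract, the following is a minimal sketch of their standard textbook definitions, AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where L is the maximized likelihood, k the number of estimated parameters, and n the sample size. It is not taken from the review article; the function names, the Gaussian error assumption, and the residual values in the demo are illustrative assumptions only.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, with the error variance set to its MLE.

    Assumes i.i.d. normal errors; this is an illustrative modeling choice.
    """
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # MLE of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_lik


if __name__ == "__main__":
    # Hypothetical residuals from two candidate models fitted to the same data.
    # The caller decides what counts toward k (here: coefficients plus variance).
    small_model = [0.9, -1.1, 0.8, -0.7, 1.2, -1.0, 0.6, -0.9]
    large_model = [0.8, -1.0, 0.8, -0.6, 1.1, -1.0, 0.6, -0.8]
    n = len(small_model)
    for name, res, k in [("small", small_model, 2), ("large", large_model, 5)]:
        ll = gaussian_log_likelihood(res)
        print(name, "AIC:", aic(ll, k), "BIC:", bic(ll, k, n))
```

Lower values are preferred under both criteria. Because ln(n) exceeds 2 once n is at least 8, BIC penalizes each extra parameter more heavily than AIC for all but very small samples.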
Is a Classification Procedure Good Enough?—A Goodness-of-Fit Assessment Tool for Classification Learning

https://doi.org/10.1080/01621459.2021.1979010

Zhang, Jiawei; Ding, Jie; Yang, Yuhong (January 2021, Journal of the American Statistical Association)

Full Text Available
On the Forecast Combination Puzzle

https://doi.org/10.3390/econometrics7030039

Qian, Wei; Rolling, Craig A.; Cheng, Gang; Yang, Yuhong (September 2019, Econometrics)

It is often reported in the forecast combination literature that a simple average of candidate forecasts is more robust than sophisticated combining methods. This phenomenon is usually referred to as the “forecast combination puzzle”. Motivated by this puzzle, we explore its possible explanations, including high variance in estimating the target optimal weights (estimation error), invalid weighting formulas, and model/candidate screening before combination. We show that the existing understanding of the puzzle should be complemented by the distinction of different forecast combination scenarios known as combining for adaptation and combining for improvement. Applying combining methods without considering the underlying scenario can itself cause the puzzle. Based on our new understandings, both simulations and real data evaluations are conducted to illustrate the causes of the puzzle. We further propose a multi-level AFTER strategy that can integrate the strengths of different combining methods and adapt intelligently to the underlying scenario. In particular, by treating the simple average as a candidate forecast, the proposed strategy is shown to reduce the heavy cost of estimation error and, to a large extent, mitigate the puzzle.
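The estimation-error explanation mentioned in this abstract can be illustrated with a small simulation. The sketch below is not the authors' method and does not implement the AFTER strategy. It assumes two unbiased forecasts with independent errors of equal variance, so equal weights are optimal, and it swaps in a simple inverse-MSE weighting rule estimated from a short history. All function names and parameter values are hypothetical.

```python
import random


def inverse_mse_weight(errors_1, errors_2):
    """Weight on forecast 1 under inverse-MSE weighting, estimated from past errors."""
    mse_1 = sum(e * e for e in errors_1) / len(errors_1)
    mse_2 = sum(e * e for e in errors_2) / len(errors_2)
    return (1 / mse_1) / (1 / mse_1 + 1 / mse_2)


def expected_mse(weight, var_1, var_2):
    """Expected squared error of w*f1 + (1-w)*f2 for unbiased, independent forecasts."""
    return weight ** 2 * var_1 + (1 - weight) ** 2 * var_2


def simulate(n_train=10, n_reps=2000, var_1=1.0, var_2=1.0, seed=0):
    """Return (risk of the simple average, average risk of estimated weights)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        # Short training history of forecast errors for each candidate.
        e1 = [rng.gauss(0, var_1 ** 0.5) for _ in range(n_train)]
        e2 = [rng.gauss(0, var_2 ** 0.5) for _ in range(n_train)]
        w = inverse_mse_weight(e1, e2)
        # Score the estimated weight by its true out-of-sample risk.
        total += expected_mse(w, var_1, var_2)
    return expected_mse(0.5, var_1, var_2), total / n_reps


if __name__ == "__main__":
    simple, estimated = simulate()
    print("simple average risk:", simple)
    print("estimated-weights risk:", estimated)
```

In this assumed setup the estimated weights scatter around 0.5, and every deviation from 0.5 raises the expected squared error, so the simple average comes out ahead. When the candidates differ substantially in accuracy the comparison can reverse, which is consistent with the abstract's point that the underlying scenario matters.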
Full Text Available
